22. San Francisco, CA.
I've only been programming for a year, but it already feels like a lifetime. Palautatan means "statistics" in Tagalog.

View Home Page | View My GitHub Profile

Assignment 2

Part 1: Image Processing Basics

Computers use tiny dots called pixels to display images. Each pixel is stored as an array of numbers that represent color intensities.

Example. In an 8-bit grayscale image, each pixel is a single number. The number represents light intensity ranging from black (0) to white (255).

Example. In a 24-bit RGB color image, each pixel is an array of 3 numbers. These numbers range from 0 to 255 and represent red, green, and blue intensity, respectively. For instance, (0, 0, 255) is bright blue and (255, 128, 0) is orange.
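As a quick sketch, here is how those two example colors look when stored as a (hypothetical) 1 × 2 NumPy image array:

```python
import numpy as np

# A tiny 1x2 RGB "image": one bright blue pixel and one orange pixel.
tiny = np.array([[[0, 0, 255],
                  [255, 128, 0]]], dtype=np.uint8)

print(tiny.shape)  # (1, 2, 3): 1 row, 2 columns, 3 color channels
print(tiny[0, 0])  # the bright blue pixel: [  0   0 255]
```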

In this assignment, you'll use Python and NumPy to manipulate 24-bit RGB color images.

You can use Image.open() from the Python imaging library (PIL) to open an image:

In [8]:
from PIL import Image, ImageFile

# Allow PIL to load images whose data is truncated. This flag lives on the
# ImageFile module and should be set before opening the image.
ImageFile.LOAD_TRUNCATED_IMAGES = True

# Cat image from https://unsplash.com/photos/FqkBXo2Nkq0
cat_img = Image.open("sharpnack.jpg")
In [4]:
import numpy as np

cat = np.array(cat_img)
In [5]:
def as_image(x):
    """Convert an ndarray to an Image.
    
    Args:
        x (ndarray): The array of pixels.
        
    Returns:
        Image: The Image object.
    """
    return Image.fromarray(np.uint8(x))


# FUNCTION TESTING
as_image(cat)
Out[5]:

Exercise 1.1. How many dimensions does the cat array have? What does each dimension represent?

Exercise 1.1

The cat array has 3 dimensions, with shape (2048, 1365, 3). The first dimension indexes the 2048 rows of pixels (the image height). Within each row, the second dimension indexes the 1365 pixels across the image (the width). Finally, each pixel is an array of 3 numbers giving its red, green, and blue intensities.

In [6]:
print(cat.shape)        # (2048, 1365, 3)
print(cat[0].shape)     # (1365, 3)
print(cat[0][0].shape)  # (3,)
(2048, 1365, 3)
(1365, 3)
(3,)

Exercise 1.2. Use .copy() to copy the cat array to a new variable. Swap the green and blue color channels in the copy. Display the result.

In [7]:
# EXERCISE 1.2

cat2 = cat.copy()
# cat2 == cat

# Swap the green (index 1) and blue (index 2) channels: RGB -> RBG
cat2[:, :, [1, 2]] = cat2[:, :, [2, 1]]

as_image(cat2)
Out[7]:

Exercise 1.3. Why is .copy() necessary in exercise 1.2? What happens if you don't use .copy()?

Exercise 1.3

The .copy() is necessary because cat2 = cat does not create a new array; it just binds a second name to the same array in memory. Without .copy(), the in-place channel swap in exercise 1.2 would also modify the original cat array, so the unmodified image would be lost.
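A minimal sketch of the aliasing behavior, using a small hypothetical array:

```python
import numpy as np

a = np.array([1, 2, 3])
b = a        # no .copy(): b is just another name for the same array
b[0] = 99
print(a)     # [99  2  3] -- the "original" changed too

c = a.copy() # an independent copy
c[1] = -1
print(a)     # [99  2  3] -- unchanged by edits to c
```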

Exercise 1.4. Flip the blue color channel from left to right. Display the resulting image. Hint: see the NumPy documentation on array manipulation routines.

In [16]:
# EXERCISE 1.4
cat3 = cat.copy()
cat3[:,:,2] = np.fliplr(cat[:,:,2])
as_image(cat3)
Out[16]:

Part 2: Singular Value Decomposition

Suppose $X$ is an $n \times p$ matrix (for instance, one color channel of the cat image). The singular value decomposition (SVD) factors $X$ as $X = UD V^T$, where:

  • $U$ is an $n \times n$ orthogonal matrix
  • $D$ is an $n \times p$ matrix with zeroes everywhere except the diagonal
  • $V$ is a $p \times p$ orthogonal matrix

Note that a matrix $A$ is orthogonal when $A^T A = I$ and $AA^T = I$.

Example. We can use NumPy to compute the SVD for a matrix:

In [9]:
x = np.array(
    [[0, 2, 3],
     [3, 2, 1]]
)
u, d, vt = np.linalg.svd(x)
# Here d is 2x2 because NumPy only returns the diagonal of D.
print("u is:\n", u, "\nd is:\n", d, "\nv^T is:\n", vt)
u is:
[[-0.68145174 -0.73186305]
 [-0.73186305  0.68145174]] 
d is:
[ 4.52966162  2.54600974] 
v^T is:
[[-0.48471372 -0.62402665 -0.6128975 ]
 [ 0.80296442 -0.03960025 -0.59470998]
 [ 0.34684399 -0.78039897  0.52026598]]
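As a sketch, we can verify the orthogonality of $U$ and $V$ and check that the factors really do multiply back to $X$. Note that NumPy returns only the diagonal of $D$ as a vector, so we first rebuild the full $2 \times 3$ matrix:

```python
import numpy as np

x = np.array([[0, 2, 3],
              [3, 2, 1]])
u, d, vt = np.linalg.svd(x)

# U and V are orthogonal: A^T A = I.
print(np.allclose(u.T @ u, np.eye(2)))    # True
print(np.allclose(vt @ vt.T, np.eye(3)))  # True

# Rebuild the full 2x3 D from the vector of singular values,
# then confirm that U D V^T recovers x.
D = np.zeros((2, 3))
D[:2, :2] = np.diag(d)
print(np.allclose(u @ D @ vt, x))         # True
```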

If we let

  • $u_i$ denote the $i$th column of $U$
  • $d_i$ denote the $i$th diagonal element of $D$
  • $v_i$ denote the $i$th column of $V$

then we can write the SVD as $\ X = UDV^T = d_1 u_1 v_1^T + \ldots + d_m u_m v_m^T\ $, where $m = \min(n, p)$, using the rules of matrix multiplication. In other words, the SVD decomposes $X$ into a sum!
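A quick sketch of this sum-of-rank-one-terms view, reusing the example matrix above: each term is the outer product $d_i u_i v_i^T$, and summing all of them reproduces $X$ exactly.

```python
import numpy as np

x = np.array([[0, 2, 3],
              [3, 2, 1]])
u, d, vt = np.linalg.svd(x)

# Sum the rank-1 terms d_i * u_i * v_i^T. With m = min(n, p) = 2 terms,
# the sum equals x exactly.
approx = sum(d[i] * np.outer(u[:, i], vt[i, :]) for i in range(len(d)))
print(np.allclose(approx, x))  # True
```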

If we eliminate some of the terms in the sum, we get a simple approximation for $X$. For instance, we could eliminate all but the first 3 terms to get the approximation $X \approx d_1 u_1 v_1^T + d_2 u_2 v_2^T + d_3 u_3 v_3^T$. This is the same as if we:

  • Zero all but the first 3 diagonal elements of $D$ to get $D_3$, then compute $X \approx UD_3V^T$
  • Eliminate all but the first 3 columns of $V$ to get the $p \times 3$ matrix $V_3$, then compute $X \approx XV_3V_3^T$

We always eliminate terms starting from the end rather than the beginning, because these terms contribute the least to $X$.

Why would we want to approximate a matrix $X$?

In statistics, principal components analysis uses this approximation to reduce the dimension (number of covariates) in a centered (mean 0) data set. The vectors $d_i u_i$ are called the principal components of $X$. The vectors $v_i^T$ are called the basis vectors. Note that both depend on $X$. The dimension is reduced by using the first $q$ principal components instead of the original $p$ covariates. In other words, the $n \times p$ data $X$ is replaced by the $n \times q$ data $UD_q = XV_q$

In computing, this approximation is sometimes used to reduce the number of bits needed to store a matrix (or image). If $q$ terms are kept, then only $nq + pq$ values (for $XV_q$ and $V_q^T$) need to be stored instead of the uncompressed $np$ values.

Exercise 2.1. Write the functions described below.

  • A function that takes a matrix $X$ and returns its principal component matrix $XV_q$ and basis matrix $V_q^T$. This function should also take the number of terms kept $q$ as an argument.

  • A function that takes a principal component matrix $XV_q$ and basis matrix $V_q^T$ and returns an approximation $\hat{X}$ for the original matrix.

As usual, make sure to document your functions. Test your function on the red color channel of the cat image. What's the smallest number of terms where the cat is still recognizable as a cat?

Exercise 2.1

This is obviously a subjective answer, but the smallest number of terms where I can still personally recognize the cat is 15. With fewer than 15 terms, I think the cat blends in too much with the grass.

In [12]:
# FIRST FUNCTION
def pcBasisMatrix(matx, q):
    """Compute the principal component and basis matrices of a matrix.

    Args:
        matx (ndarray): The matrix X to decompose.
        q (int): The number of terms to keep.

    Returns:
        tuple: The principal component matrix XV_q and the basis matrix V_q^T.
    """
    u, d, vt = np.linalg.svd(matx)
    vq = np.transpose(vt)[:, :q]
    pcm = np.dot(matx, vq)
    bm = np.transpose(vq)
    return pcm, bm


# SECOND FUNCTION
def approxMatrix(pcm, bm):
    """Reconstruct an approximation X-hat from XV_q and V_q^T."""
    return np.dot(pcm, bm)


# TRY ON RED CHANNEL
def scaryCat(matx, q):
    """Compress matx to q terms, then return the reconstructed approximation."""
    pcm, bm = pcBasisMatrix(matx, q)
    return approxMatrix(pcm, bm)

# CHECK HOW MANY TERMS
as_image(scaryCat(cat[:,:,0],13))
Out[12]:

Exercise 2.2. You can check the number of bytes used by a NumPy array with the .nbytes attribute. How many bytes does the red color channel of the cat image use? How many bytes does the compressed version use when 10 terms are kept? What percentage of the original size is this?

Exercise 2.2

The original red color channel of the cat image uses 2,795,520 bytes.
The compressed version with 10 terms kept uses 273,040 bytes.
The compressed version is therefore about 9.77% of the original size.

In [13]:
# EXERCISE 2.2
original_bytes = cat[:,:,0].nbytes
print("Original bytes: "+str(original_bytes))

pcm, bm = pcBasisMatrix(cat[:,:,0], 10)
compressed_bytes = pcm.nbytes+bm.nbytes
print("Compressed bytes: "+str(compressed_bytes))
Original bytes: 2795520
Compressed bytes: 273040
In [14]:
ratio = float(pcm.nbytes + bm.nbytes) / cat[:, :, 0].nbytes
print("Percentage of Original Size: "+str(ratio*100)+"%")
Percentage of Original Size: 9.76705586081%